The most popular multiple testing procedures are stepwise procedures based on P-values for individual test statistics. Included among these are the false discovery rate (FDR) controlling procedures of Benjamini-Hochberg [J. Roy. Statist. Soc. Ser. B 57 (1995) 289-300] and their offspring. Even for models that entail dependent data, P-values based on marginal distributions are used. Unlike such methods, the new method takes dependency into account at all stages. Furthermore, the P-value procedures often lack an intuitive convexity property, which is needed for admissibility. Still further, the new methodology is computationally feasible. If the number of tests is large and the proportion of true alternatives is less than, say, 25 percent, simulations demonstrate a clear preference for the new methodology. Applications are detailed for models such as testing treatments against control (or any intraclass correlation model), testing for change points and testing means when correlation is successive.

1. Introduction. The need for multiple testing procedures (MTPs) has 
been given great impetus by diverse fields of application such as microarrays, 
astronomy, mutual fund evaluations, proteomics, disclosure risk, cytometry, 
imaging and others. Traditional methods to deal with multiple testing when 
the number of tests is large are deemed too conservative (i.e., they do not 
detect significant effects often enough). New approaches to multiple testing 
have arisen. Many of the new approaches are classified as stepwise procedures, such as step-up or step-down, in contrast to single step procedures [see Hochberg and Tamhane (1987) and also Dudoit, Shaffer and Boldrick (2003), where 18 procedures are listed as single step, step-up or step-down].



Received July 2007; revised April 2008. 

Supported by NSF Grant DMS-04-57248 and NSA Grant H-98230-06-0076.

AMS 2000 subject classifications. Primary 62F03; secondary 62J15. 

Key words and phrases. Admissibility, change point problem, false discovery rate, likelihood ratio, residuals, step-down procedure, step-up procedure, successive correlation model, treatments vs. control, two-sided alternatives, vector risk.

This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in The Annals of Statistics, 2009, Vol. 37, No. 3, 1518-1544. This reprint differs from the original in pagination and typographic detail.






A. COHEN, H. B. SACKROWITZ AND M. XU 



Among the more popular procedures is the Benjamini-Hochberg (1995) false discovery rate (FDR) controlling procedure. Many offspring have followed [see, e.g., Efron et al. (2001), Storey and Tibshirani (2003), Sarkar (2002), Benjamini and Yekutieli (2001) and Cai and Sarkar (2006), just to mention a few]. Typically, the stepwise procedures deal with P-values determined from marginal distributions [see, e.g., Dudoit and van der Laan (2008), Chapter 3]. Even when the model entails random vectors with correlated variates, P-values from marginal distributions, ignoring correlations, are the basis of the procedures.

Many multiple testing procedures are designed to control some error rate such as the familywise error rate FWER (weak and strong), $k$-FWER [see Lehmann and Romano (2005)] and FDR. However, many researchers study
the multiple testing problem as a finite action decision problem with a va- 
riety of loss functions [see, e.g., Lehmann (1957), Genovese and Wasserman 
(2002), Ishwaran and Rao (2003) and Muller et al. (2004)]. In these studies, 
the merits of the procedures are evaluated and compared by their risk functions. The risk function approach does not always require control of a particular kind of error rate and can sometimes lead to procedures whose overall performance is preferred or even strongly preferred to that of an error controlling procedure. Whereas FDR control is appropriate for some situations where the number of tests is large, there are many situations where one would prefer a procedure whose expected numbers of both type I errors and type II errors are smaller. Dudoit and van der Laan (2008) study expected values of functions of numbers of type I and type II errors.

A series of papers [Cohen and Sackrowitz (2005, 2007, 2008) and Cohen, Kolassa and Sackrowitz (2007)] demonstrated that, given a typical step-up or step-down procedure, there exist other procedures whose expected numbers of type I and type II errors are smaller. In fact, Cohen and Sackrowitz (2008) show that, for multivariate normal models with nonzero correlation and two-sided alternatives, there exist procedures whose individual tests have smaller expected type I and type II errors.

In this paper, we assume $\mathbf{X}$ is an $M \times 1$ vector that is multivariate normal with mean vector $\boldsymbol{\mu}$ and covariance matrix $\Gamma = \sigma^2\Sigma$. The matrix $\Sigma$ is a known positive definite nondiagonal matrix. The parameter $\sigma^2$ is either known or unknown. In the latter case an estimator of $\sigma^2$, which is a scaled chi-square variable, is available and this variable is independent of $\mathbf{X}$. This is a classical linear model assumption. $\Sigma$ is known since it is a function of the design matrix. We will demonstrate the new methodology in two important subclasses of this model. The first is the intraclass covariance matrix model, which characterizes the popular situation in which the variables are exchangeable. This model includes the problem of testing several treatments against a control. The second application is to the successive correlation covariance matrix, which includes change point problems. We test two-sided alternatives (i.e., $H_i: \mu_i = 0$ vs. $K_i: \mu_i \ne 0$, $i = 1, \ldots, M$). We also test one-sided alternatives (i.e., $H_i^*: \mu_i \le 0$ vs. $K_i^*: \mu_i > 0$, $i = 1, \ldots, M$, or $H_i: \mu_i = 0$ vs. $K_i^*: \mu_i > 0$).

The goal of this paper is to develop good MTPs in the case of correlated variables. To begin with, we realize that every MTP induces individual tests, $\phi_i$, for the individual hypothesis testing problems $H_i$ vs. $K_i$. The behavior of these tests should be of fundamental concern. However, the stepwise construction of most MTPs often makes it difficult to describe and study the individual tests.

In particular, suppose an individual test induced by an MTP is inadmis- 
sible for the standard hypothesis testing loss. That is, for that individual 
hypothesis testing problem, a test exists whose size is no greater than the 
stepwise procedure test and whose power is no less with some strict inequal- 
ity. It would then follow that the overall procedure would be inadmissible 
whenever the risk function is a monotone function of the expected numbers 
of type I and type II errors. 

As a first step, we find a convexity property that is necessary and sufficient 
for admissibility of the individual tests. In Cohen and Sackrowitz (2008), it 
has been shown that most popular stepwise procedures do not possess the 
convexity property when there is correlation in the two-sided alternative 
case. Next, we construct a step-down type MTP whose individual tests do 
have the required convexity property. As is typical in problems where no 
single optimal procedure exists, the selection of a procedure is somewhat 
subjective. In evaluating procedures, we focus mainly on the expected num- 
ber of type I and type II errors that the procedures make. 

The new stepwise testing method proposed is based on the maximum of 
adaptively formed residuals. The method is called maximum residual down 
(MRD). The MRD method has several advantages over the stepwise methods 
that are currently recommended in the literature: 

(1) The main justification for MRDs is the fact that MRD tests take into account the correlation among the $M$ variates. Thus, MRD utilizes information oftentimes not used by the current P-value methods. This property of the MRD procedure is the likely explanation for the apparent improved overall performance of MRD when compared to the P-value methods based on marginal distributions.

(2) MRD procedures have an intuitive and desirable convexity property 
required for admissibility. Whereas admissibility is not in itself a compelling 
property, inadmissibility can be a serious shortcoming. 

(3) For the treatments vs. control and change point models, for large and relevant portions of the parameter space, simulations demonstrate that the MRD method makes substantially fewer mistakes than the popular FDR controlling procedures. In particular, if the proportion of true alternatives is less than 25 percent of the total number of tests, the simulations indicate that the method performs well.

(4) The MRD method is applicable in all cases where $\Sigma$ is known.

For arbitrary $\Sigma$ and $M$ extremely large, the procedure essentially requires inversion of a large matrix. This could be computationally difficult. The level of difficulty depends on the structure of $\Sigma$. In the two popular models we consider, $\Sigma$ can easily be inverted regardless of how large $M$ is. The first model is intraclass. The concept of intraclass covariance matrix was introduced by Rao (1945). Subsequently it has been discussed in articles in behavioral genetics and statistics [see, e.g., Carey (2005) and Krishnaiah and Pathak (1967)]. Such a model is appropriate whenever the components of $\mathbf{X}$ have a multivariate distribution that is exchangeable. In particular, all variances are equal and all covariances are equal. The intraclass correlation matrix is appropriate for the model of testing each of $M - 1$ treatments against a control. MRD is readily applicable here, since inversion of the appropriate covariance matrix is easily facilitated. The second model is successive correlation. This model has a constant nonzero correlation coefficient between adjacent pairs of variables. All other correlations are zero [see Krishnaiah and Pathak (1967)]. This model presents no computational issues even if $M$ is extremely large. A special case of this model is the change point problem [see Chen and Gupta (2000)]. We will see in Section 6 that the MTP method discussed in Chen and Gupta (2000) is based on many collections of pooled means. This is precisely the set of statistics given by the MRD method applied to this very special case. In a sense, this validates our very general approach, even though our method uses the statistics differently than in Chen and Gupta (2000).

A seemingly logical step-down method that would take correlations into account is to successively perform likelihood ratio tests (LRT) of global hypotheses. That is, one could employ the closure method [see Marcus, Peritz and Gabriel (1976)] using an LRT, at step one, for $\boldsymbol{\mu} = \mathbf{0}$ vs. $\boldsymbol{\mu} \ne \mathbf{0}$. If the global test rejects, then eliminate the variate corresponding to $\max_{1 \le j \le M} |x_j|$. One continues in a step-down fashion in determining the LRT-based MTP. Call this procedure LRSD.

When $\Sigma$ is intraclass for two-sided alternatives, LRSD is admissible for any monotone collection of critical constants only when $M = 2$ or $M = 3$. For $M \ge 4$, counterexamples abound. That is, there are many critical constants for which LRSD is inadmissible. Furthermore, critical constants are found for $M \ge 5$, which relate to constants that are likely to be used. This inadmissibility of LRSD is what prompted and led us to MRD.

For one-sided alternatives when $\Sigma$ is intraclass, LRSD is admissible even in cases where the common variance $\sigma^2$ is unknown (provided replications of the observations are taken). In this instance, LRSD can be a competitor to MRD, and this is reflected in the simulation study in Section 7.
We note there that $M$ is taken to be 100. Large values of $M$ entail computational problems for LRSD, since under the alternative the parameter space is constrained, and the software needed to carry out the tests for $M$ much greater than 100 is very time consuming. For the treatments vs. control model, one might think that the P-value based step-down procedure, based on the analogue of Dunnett's one-sided tests of global hypotheses, might also be a competitor [see Westfall and Young (1993), Section 3.2.1]. In this instance, dependency is taken into account when determining critical values. Nevertheless, it does not take correlation into account in the test statistics, and, overall, the procedure does not fare well in the simulation study.

As previously mentioned, another problem of interest is testing for change 
points in a sequence of M + 1 independent normal trials. Sometimes, it 
is assumed that the means are nondecreasing, in which case the model is 
referred to as a simple order model. One seeks to determine whether a change 
in mean has occurred at particular time points. The alternative at each time 
point is either two-sided or one-sided. In either case, the LRSD step-down 
method is mostly inadmissible, while the MRD method is admissible. 

Returning to the general case, we remark that if $\Sigma$ is unknown but replications are available, an estimator of $\Sigma$ can replace it in the MRD method. We cannot claim the optimality properties, but, nevertheless, the method is viable. For large numbers of replications, even the normality assumption may not be crucial.

In the next section, we describe the MRD method. In Section 3, we prove 
that the MRD method is admissible for the vector risk where each compo- 
nent of the vector is the testing risk for an individual test. Admissibility 
depends on whether each individual test function has an intuitive convexity 
property. Section 4 is concerned with the LRT based step-down procedure 
(LRSD). Here, there are both admissibility and inadmissibility results of 
interest. Section 5 contains a geometric connection between the MRD and 
LRSD methods, some other interesting interpretations and some figures re- 
lated to the geometric interpretation. Results concerned with testing several 
treatments vs. control, the change point problem and the successive corre- 
lation model are given in Section 6. Simulations and analyses are given in 
Section 7. Most proofs appear in the Appendix. 

2. MRD method. Assume $\mathbf{X} = (X_1, \ldots, X_M)'$ is distributed according to a multivariate normal distribution with mean vector $\boldsymbol{\mu}$ and covariance matrix $\sigma^2\Sigma = \sigma^2(\sigma_{ij})$. The matrix $\Sigma$ is assumed known, and, for now, we take $\sigma^2$ to be known and, without loss of generality, let $\sigma^2 = 1$. The two-sided multiple testing problem is to test

$$H_i: \mu_i = 0 \quad \text{vs.} \quad K_i: \mu_i \ne 0, \qquad i = 1, \ldots, M. \tag{2.1}$$




We will also consider one-sided alternative problems

$$H_i: \mu_i = 0 \quad \text{vs.} \quad K_i^*: \mu_i > 0 \tag{2.2}$$

and

$$H_i^*: \mu_i \le 0 \quad \text{vs.} \quad K_i^*: \mu_i > 0. \tag{2.3}$$

For now, we focus on the two-sided case (2.1).

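By way of notation, $\mathbf{x}^{(i_1,\ldots,i_r)}$ is the $(M - r) \times 1$ vector consisting of the components of $\mathbf{x}$ with $x_{i_1}, \ldots, x_{i_r}$ left out. $\Sigma_{(i_1,\ldots,i_r)}$ is the $(M - r) \times (M - r)$ covariance matrix of $\mathbf{X}^{(i_1,\ldots,i_r)}$. $\boldsymbol{\sigma}_j^{(j_1,\ldots,j_{m-1})}$ is the $(M - m) \times 1$ vector of covariances between $X_j$ and all variables except $X_{j_1}, \ldots, X_{j_{m-1}}$ and $X_j$.

$$\sigma_{j\cdot(j_1,\ldots,j_{m-1})} = \sigma_{jj} - \boldsymbol{\sigma}_j^{(j_1,\ldots,j_{m-1})\prime}\, \Sigma_{(j_1,\ldots,j_{m-1},j)}^{-1}\, \boldsymbol{\sigma}_j^{(j_1,\ldots,j_{m-1})}$$

is the conditional variance of $X_j$, given all variables except $X_{j_1}, \ldots, X_{j_{m-1}}, X_j$. Now, define

$$U_{mj}^{(j_1,\ldots,j_{m-1})}(\mathbf{x}) = \bigl(x_j - \boldsymbol{\sigma}_j^{(j_1,\ldots,j_{m-1})\prime}\, \Sigma_{(j_1,\ldots,j_{m-1},j)}^{-1}\, \mathbf{x}^{(j_1,\ldots,j_{m-1},j)}\bigr) \big/ \sigma_{j\cdot(j_1,\ldots,j_{m-1})}^{1/2} \tag{2.4}$$

for $m = 1, \ldots, M$.

The $m$ subscript represents the stage of the MRD procedure. Note that

$$U_{mj}^{(j_1,\ldots,j_{m-1})}(\mathbf{X}) = \bigl(X_j - E_0\{X_j \mid \mathbf{X}^{(j_1,\ldots,j_{m-1},j)}\}\bigr) \Big/ \sqrt{\operatorname{var}\bigl(X_j \mid \mathbf{X}^{(j_1,\ldots,j_{m-1},j)}\bigr)},$$

where $E_0$ is taken under $\boldsymbol{\mu} = \mathbf{0}$.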

We now describe a general class of stepwise down procedures, given a set of $M 2^{M-1}$ functions $U_{mj}(\mathbf{x})$. At most, $M(M + 1)/2$ of these need to be calculated to carry out the procedure. The $m$ index ranges over $1, 2, \ldots, M$ and represents the $m$th stage. At stage $m$ there are $M - m + 1$ functions.

Let $C_1 > C_2 > \cdots > C_M > 0$ be a given set of constants. At stage 1, consider $U_{1j}(\mathbf{x})$, $j \in \{1, \ldots, M\}$. Let $j_1 = j_1(\mathbf{x})$ be such that $|U_{1j_1}(\mathbf{x})| = \max_j |U_{1j}(\mathbf{x})|$. If $|U_{1j_1}(\mathbf{x})| < C_1$, stop and accept all $H_i$. Otherwise, reject $H_{j_1}$ and continue to stage 2.

At stage 2, consider the $M - 1$ functions $U_{2j}^{(j_1)}(\mathbf{x})$, $j \in \{1, \ldots, M\} \setminus \{j_1\}$. Note that $U_{2j}^{(j_1)}(\mathbf{x})$ just depends on $\mathbf{x}^{(j_1)}$. Let $j_2 = j_2(\mathbf{x}^{(j_1)})$ be such that $|U_{2j_2}| = \max_j |U_{2j}^{(j_1)}|$, $j \in \{1, \ldots, M\} \setminus \{j_1\}$. If $|U_{2j_2}| < C_2$, stop and accept all remaining null hypotheses. Otherwise, reject $H_{j_2}$ and continue to stage 3.

In general, at stage $m$, $m = 1, \ldots, M$, consider the $M - m + 1$ functions $U_{mj}^{(j_1,\ldots,j_{m-1})}(\mathbf{x})$, $j \in \{1, \ldots, M\} \setminus \{j_1, \ldots, j_{m-1}\}$. Note that $U_{mj}^{(j_1,\ldots,j_{m-1})}$ depends on $\mathbf{x}^{(j_1,\ldots,j_{m-1})}$. Let $j_m = j_m(\mathbf{x}^{(j_1,\ldots,j_{m-1})})$ be such that $|U_{mj_m}| = \max_j |U_{mj}^{(j_1,\ldots,j_{m-1})}|$, $j \in \{1, \ldots, M\} \setminus \{j_1, \ldots, j_{m-1}\}$. If $|U_{mj_m}| < C_m$, stop and accept all remaining null hypotheses. Otherwise, reject $H_{j_m}$ and continue to stage $m + 1$ (unless $m = M$, in which case, stop).
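The stagewise rule can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names (`inverse`, `standardized_residuals`, `mrd`) are hypothetical, indices are 0-based, $\sigma^2 = 1$, and the residual statistic is computed through the standard identity that the standardized residual of $x_j$ given the remaining variables equals $y_j / \sqrt{p_{jj}}$, where $P = (p_{ab})$ is the inverse of the covariance matrix of the remaining variables and $\mathbf{y} = P\mathbf{x}$.

```python
import math

def inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def standardized_residuals(x, sigma, remaining):
    """Residual statistic for each index in `remaining` (variables not yet eliminated)."""
    sub = [[sigma[i][j] for j in remaining] for i in remaining]
    prec = inverse(sub)                      # inverse covariance of the remaining variables
    xs = [x[i] for i in remaining]
    out = {}
    for a, j in enumerate(remaining):
        y = sum(prec[a][b] * xs[b] for b in range(len(xs)))
        out[j] = y / math.sqrt(prec[a][a])   # residual divided by its conditional s.d.
    return out

def mrd(x, sigma, constants, two_sided=True):
    """Return the indices rejected by the step-down rule, in order of rejection."""
    remaining = list(range(len(x)))
    rejected = []
    for c in constants[:len(x)]:
        u = standardized_residuals(x, sigma, remaining)
        score = (lambda j: abs(u[j])) if two_sided else (lambda j: u[j])
        j_max = max(remaining, key=score)
        if score(j_max) < c:
            break                            # stop and accept all remaining hypotheses
        rejected.append(j_max)
        remaining.remove(j_max)
    return rejected
```

For instance, with an intraclass matrix with correlation 0.5, the observation `[5.0, 0.1, -0.2]` and the (arbitrarily chosen) constants `[3.0, 2.5, 2.0]`, the sketch rejects only the first hypothesis; passing `two_sided=False` drops the absolute values.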

The above MTP determines test functions for each individual testing problem. Let $\phi_U(\mathbf{x})$ denote the test function for testing $H_1: \mu_1 = 0$ vs. $K_1: \mu_1 \ne 0$.

Note that, at the beginning of Section 7, we offer a discussion regarding the choice of $C_1, \ldots, C_M$.

3. Admissibility of MRD. We will demonstrate that, for each individual testing problem, the MTP based on the MRD method is admissible. Without loss of generality, we focus on $H_1$ vs. $K_1$ and start with the case $\sigma^2$ known ($\sigma^2 = 1$). Our plan is to use a result of Matthes and Truax (1967) which offers a necessary and sufficient condition for admissibility of a test of $H_1$ vs. $K_1$ when the joint distribution of $\mathbf{X}$ is an exponential family. We will state this result for our model as Lemma 3.1. We next demonstrate, in Lemma 3.2, that the $U_{mj}$ function given in (2.4) has certain monotonicity properties. These monotonicity properties will enable us to prove, in Lemma 3.3, that the individual test functions for $H_1$ vs. $K_1$ have a convexity property that is necessary and sufficient for admissibility. Theorem 3.1 summarizes and states the admissibility of the MRD procedure.

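Now, we express the density of $\mathbf{X}$ as

$$f_{\mathbf{X}}(\mathbf{x} \mid \boldsymbol{\mu}) = \frac{1}{(2\pi)^{M/2} |\Sigma|^{1/2}} \exp\bigl\{-\tfrac{1}{2}(\mathbf{x} - \boldsymbol{\mu})'\Sigma^{-1}(\mathbf{x} - \boldsymbol{\mu})\bigr\}, \tag{3.1}$$

which, in exponential family form, is

$$f_{\mathbf{X}}(\mathbf{x} \mid \boldsymbol{\mu}) = h(\mathbf{x})\beta(\boldsymbol{\mu}) \exp\{\mathbf{x}'\Sigma^{-1}\boldsymbol{\mu}\}. \tag{3.2}$$

Next, let $\mathbf{Y} = \Sigma^{-1}\mathbf{X}$ so that

$$f_{\mathbf{Y}}(\mathbf{y} \mid \boldsymbol{\mu}) = h^*(\mathbf{y})\beta(\boldsymbol{\mu}) \exp\Bigl\{\sum_{i=1}^{M} y_i\mu_i\Bigr\}. \tag{3.3}$$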

Lemma 3.1. A necessary and sufficient condition for a test $\phi(\mathbf{y})$ of $H_1: \mu_1 = 0$ vs. $K_1: \mu_1 \ne 0$ to be admissible is that, for almost every fixed $y_2, \ldots, y_M$, the acceptance region of the test is an interval in $y_1$.

Proof. See Matthes and Truax (1967). □

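Note that, to study the test function $\phi(\mathbf{y}) = \phi_U(\mathbf{x})$ as $y_1$ varies and $(y_2, \ldots, y_M)$ remain fixed, we can consider sample points $\mathbf{x} + r\mathbf{g}$, where $\mathbf{g}$ is the first column of $\Sigma$ and $r$ varies. This is true, since $\mathbf{y}$ is a function of $\mathbf{x}$, and so $\mathbf{y}$ evaluated at $(\mathbf{x} + r\mathbf{g})$ is $\Sigma^{-1}(\mathbf{x} + r\mathbf{g}) = \mathbf{y} + (r, 0, \ldots, 0)' = (y_1 + r, y_2, \ldots, y_M)'$.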
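The shift property just described is easy to check numerically. The following sketch uses an arbitrary 3 x 3 intraclass matrix and arbitrary values of $\mathbf{y}$ and $r$ (all chosen here purely for illustration); to avoid a matrix inversion it verifies the equivalent statement $\Sigma(\mathbf{y} + r\mathbf{e}_1) = \mathbf{x} + r\mathbf{g}$, where $\mathbf{x} = \Sigma\mathbf{y}$.

```python
def matvec(a, v):
    """Matrix-vector product for lists of lists."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]

sigma = [[1.0, 0.5, 0.5],
         [0.5, 1.0, 0.5],
         [0.5, 0.5, 1.0]]
g = [row[0] for row in sigma]      # first column of Sigma
y = [0.3, -1.2, 2.0]               # an arbitrary point in the y coordinates
x = matvec(sigma, y)               # the sample point x with Sigma^{-1} x = y
r = 0.7

shifted_x = [xi + r * gi for xi, gi in zip(x, g)]
# Sigma (y + r e_1) must reproduce x + r g, i.e. Sigma^{-1}(x + r g) = y + (r, 0, 0)'.
image = matvec(sigma, [y[0] + r, y[1], y[2]])
max_gap = max(abs(a - b) for a, b in zip(shifted_x, image))
```

Moving along the direction $\mathbf{g}$ in the sample space therefore changes only the first coordinate of $\mathbf{y}$, which is what allows Lemma 3.1 to be applied one coordinate at a time.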
From here on it will be convenient to express the functions $U_{mj}^{(j_1,\ldots,j_{m-1})}(\mathbf{x})$ of (2.4) simply as $U_{mj}(\mathbf{x})$. No confusion should ensue.

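Lemma 3.2. The functions $U_{mj}(\mathbf{x})$ given in (2.4) have the following properties.

As a function of $r$,

$$U_{m1}(\mathbf{x} + r\mathbf{g}) = U_{m1}(\mathbf{x}) + r. \tag{3.4}$$

For $m = 1, \ldots, M$; $j \in \{2, \ldots, M\} \setminus \{j_1, \ldots, j_{m-1}\}$, $j_1 \ne 1, \ldots, j_{m-1} \ne 1$,

$$U_{mj}(\mathbf{x} + r\mathbf{g}) = U_{mj}(\mathbf{x}). \tag{3.5}$$

Proof. See Appendix. □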

Lemma 3.3. Suppose that, for some $\mathbf{x}^*$ and $r_0 > 0$, $\phi_U(\mathbf{x}^*) = 0$ and $\phi_U(\mathbf{x}^* + r_0\mathbf{g}) = 1$. Then, $\phi_U(\mathbf{x}^* + r\mathbf{g}) = 1$ for all $r > r_0$.

Proof. See Appendix. □

Note that Lemma 3.3 implies that the acceptance region in $y_1$, for fixed $y_2, \ldots, y_M$, is an interval.

Theorem 3.1. For the two-sided case, the MRD procedure based on $\{U_{mj}\}$ is admissible.

Proof. Admissible means that each individual test for each hypothesis testing problem is admissible. Without loss of generality, we show admissibility of $\phi_U(\mathbf{x})$ for $H_1$ vs. $K_1$. Proof that the other tests are admissible for the other hypotheses would be done the same way. That $\phi_U(\mathbf{x})$ is admissible for $H_1$ vs. $K_1$ follows readily from Lemmas 3.1 and 3.3. □

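For the case where $\sigma^2$ is unknown, we assume that we have available an unbiased estimator $s^2$ of $\sigma^2$ with the property that $\nu s^2/\sigma^2$ is a $\chi^2_\nu$ variable that is independent of $\mathbf{X}$. In this situation, we write the joint density of $(\mathbf{X}, s^2)$ as

$$\begin{aligned} f(\mathbf{x}, s^2) &= \frac{\nu(\nu s^2)^{\nu/2-1}}{(2\pi)^{M/2}\, 2^{\nu/2}\, \Gamma(\nu/2)\, (\sigma^2)^{(M+\nu)/2}\, |\Sigma|^{1/2}} \exp\Bigl\{-\frac{1}{2\sigma^2}\bigl[(\mathbf{x} - \boldsymbol{\mu})'\Sigma^{-1}(\mathbf{x} - \boldsymbol{\mu}) + \nu s^2\bigr]\Bigr\} \\ &= H(\mathbf{x}, T)\, B(\boldsymbol{\mu}, \sigma^2) \exp\bigl\{\mathbf{x}'\Sigma^{-1}\boldsymbol{\mu}/\sigma^2 - (1/2\sigma^2)T\bigr\}, \end{aligned} \tag{3.6}$$

where $T = \nu s^2 + \mathbf{x}'\Sigma^{-1}\mathbf{x}$. Note that the change from $(\mathbf{x}, s^2)$ to $(\mathbf{x}, T)$ limits the values of $\mathbf{x}$ in the sample space to those for which $\mathbf{x}'\Sigma^{-1}\mathbf{x} \le T$.

The MRD method now utilizes statistics $U_{mj}(\mathbf{x})/T^{1/2}$. All lemmas and Theorem 3.1 hold, with $T$ as well as $y_2, \ldots, y_M$ fixed, where, once again, $\mathbf{y} = \Sigma^{-1}\mathbf{x}$.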

For one-sided alternatives specified in (2.2) and (2.3), the MRD method simply uses $U_{mj}$ in place of $|U_{mj}|$. The result of Theorem 3.1 can be proved similarly.

4. Likelihood ratio step-down method (LRSD). Assume that $\mathbf{X}$ is distributed as multivariate normal with unknown mean vector $\boldsymbol{\mu}$ and known intraclass covariance matrix $\Sigma$. Without loss of generality, we take the diagonal elements of $\Sigma$ to be 1 and the off-diagonal elements to be $\rho$. The LRSD method is to test by the LRT, at stage 1, the global hypothesis $H_{1G}: \boldsymbol{\mu} = \mathbf{0}$ vs. $K_{1G}: \boldsymbol{\mu} \ne \mathbf{0}$. If $H_{1G}$ is not rejected, then stop and accept all $H_i$, $i = 1, \ldots, M$. If $H_{1G}$ is rejected, then reject $H_{j_1}$, where $j_1$ is the index for which $|x_{j_1}| = \max_{1 \le j \le M} |x_j|$, and continue to stage 2. At stage 1 use the critical value $C_1$. At stage 2, test by the LRT the global hypothesis $H_{2G}: \boldsymbol{\mu}^{(j_1)} = \mathbf{0}$ vs. $K_{2G}: \boldsymbol{\mu}^{(j_1)} \ne \mathbf{0}$, where $\boldsymbol{\mu}^{(j_1)}$ is the $(M - 1) \times 1$ vector of means that are the same as $\boldsymbol{\mu}$ save $\mu_{j_1}$ is left out. Use the critical value $C_2 < C_1$. Proceed as in stage 1. At stage $m$, test by the LRT $H_{mG}: \boldsymbol{\mu}^{(j_1,\ldots,j_{m-1})} = \mathbf{0}$ vs. $K_{mG}: \boldsymbol{\mu}^{(j_1,\ldots,j_{m-1})} \ne \mathbf{0}$ and so on. At stage $m$, use the critical value $C_m < C_{m-1}$.

We will demonstrate that the LRSD is admissible for $M = 2$ and $M = 3$. For $M \ge 4$ there exist counterexamples for certain collections of critical values and certain values of $\rho$. We offer a counterexample when $M = 4$, and, when $M = 5$, we demonstrate inadmissibility for a large class of practical critical values for logical values of $\rho$. In fact, for large $M$ using critical values, it turns out that for most $\rho$ values ($\rho \ne 0$) counterexamples demonstrate that LRSD is inadmissible.

On the other hand, should the alternatives for the individual hypotheses 
be the one-sided alternatives given in (2.2), then the LRSD is admissible. 

When the alternative is two-sided, the results of Section 3 imply that admissibility of a test for an individual hypothesis testing problem (say $H_1$ vs. $K_1$, without loss of generality) is determined by whether the conditional acceptance region in $y_1$ given $(y_2, \ldots, y_M)$ is an interval. (Recall $\mathbf{y} = \Sigma^{-1}\mathbf{x}$.) When the alternative is one-sided, the conditional acceptance region is a left-sided half line.

Focusing first on the two-sided alternative case, we note that the LRT for $H_{1G}$ vs. $K_{1G}$ is to reject if

$$\mathbf{x}'\Sigma^{-1}\mathbf{x} > C_1, \tag{4.1}$$

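where

$$\Sigma^{-1} = \frac{1}{1 - \rho}\{I - G(\mathbf{1}\mathbf{1}')\} \tag{4.2}$$

and $G = \rho/(1 + (M - 1)\rho)$. As such,

$$\Sigma^{-1} = \frac{1}{(1 - \rho)(1 + (M - 1)\rho)} \begin{pmatrix} 1 + (M-2)\rho & -\rho & \cdots & -\rho \\ -\rho & 1 + (M-2)\rho & \cdots & -\rho \\ \vdots & \vdots & \ddots & \vdots \\ -\rho & -\rho & \cdots & 1 + (M-2)\rho \end{pmatrix}. \tag{4.3}$$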
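Because the inverse of an intraclass matrix is available in closed form, no numerical inversion is needed however large $M$ is. The short check below builds the matrix with unit diagonal and off-diagonal $\rho$, builds the closed-form inverse with diagonal $(1 + (M-2)\rho)$ and off-diagonal $-\rho$ scaled by $1/[(1-\rho)(1+(M-1)\rho)]$, and measures how far their product is from the identity. The function names are hypothetical and chosen only for this illustration.

```python
def intraclass(m, rho):
    """Intraclass matrix: 1 on the diagonal, rho everywhere else."""
    return [[1.0 if i == j else rho for j in range(m)] for i in range(m)]

def intraclass_inverse(m, rho):
    """Closed-form inverse; requires -1/(m-1) < rho < 1."""
    scale = 1.0 / ((1.0 - rho) * (1.0 + (m - 1) * rho))
    diag, off = scale * (1.0 + (m - 2) * rho), -scale * rho
    return [[diag if i == j else off for j in range(m)] for i in range(m)]

def max_identity_error(m, rho):
    """Largest entrywise gap between Sigma * Sigma^{-1} and the identity."""
    s, s_inv = intraclass(m, rho), intraclass_inverse(m, rho)
    worst = 0.0
    for i in range(m):
        for j in range(m):
            entry = sum(s[i][k] * s_inv[k][j] for k in range(m))
            worst = max(worst, abs(entry - (1.0 if i == j else 0.0)))
    return worst
```

Calling `max_identity_error(6, 0.3)` returns a value at the level of floating-point rounding.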



Again, let $\mathbf{g}$ be the first column of $\Sigma$.
Our first result in this section follows. 

Theorem 4.1. For the two-sided alternative case, LRSD is admissible for $M = 2$ and $M = 3$.

Proof. See Appendix. □ 

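For $M = 4$, we exhibit a set of critical values for which LRSD is inadmissible. To do so, we find a sample point $\mathbf{x}^*$ at which $H_1$ is rejected and for which $H_1$ is accepted at $\mathbf{x}^* + \gamma\mathbf{g}$. In fact, let $\mathbf{x}^* = (a, -a - \Delta, b, -b - \varepsilon)'$ for $b > a + \Delta > a > 0$ and $\varepsilon > 0$. Thus, using (4.1) at stage 1, choose $C_1$ so that $\mathbf{x}^{*\prime}\Sigma^{-1}\mathbf{x}^* > C_1$, so that $H_4$ is rejected and variable $x_4^*$ is eliminated at stage 2. At stage 2, we calculate

$$\mathbf{x}^{*(4)\prime}\Sigma_{(4)}^{-1}\mathbf{x}^{*(4)} = \frac{1}{1 + \rho - 2\rho^2}\bigl\{(1 + \rho)b^2 + 2a^2(1 + 2\rho) + 2\Delta[a + 2a\rho + \rho b + (1 + \rho)\Delta/2]\bigr\}. \tag{4.4}$$

We set $\mathbf{x}^{*(4)\prime}\Sigma_{(4)}^{-1}\mathbf{x}^{*(4)} = C_2$. At stage 3, $H_2$ is rejected, and, at stage 4, $H_1$ is rejected. Now, if $\rho > 0$, let $\gamma = \varepsilon/\rho$ and note that $(\mathbf{x}^* + \gamma\mathbf{g})'\Sigma^{-1}(\mathbf{x}^* + \gamma\mathbf{g}) > C_1$. This time, however, $H_3$ is rejected at stage 1. At stage 2, we calculate, for $\gamma = \varepsilon/\rho$,

$$(\mathbf{x}^{*(3)} + \gamma\mathbf{g}^{(3)})'\Sigma_{(3)}^{-1}(\mathbf{x}^{*(3)} + \gamma\mathbf{g}^{(3)}). \tag{4.5}$$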



We note that (4.4) minus (4.5) is 




There are many choices of $a, b, \Delta, \varepsilon, \rho, \gamma$ for which (4.6) is positive (e.g., $a = 2$, $b = 4$, $\Delta = 1$, $\varepsilon = 0.1$, $\rho = 0.5$, $\gamma = 0.2$). The fact that (4.6) $> 0$ implies that at $\mathbf{x}^* + \gamma\mathbf{g}$ the overall procedure rejects $H_3$ and accepts $H_1$, $H_2$ and $H_4$.
Note that, since > 0, x* — xig is an accept point. Now, if Hi is rejected 
for $\mathbf{x} = \mathbf{x}^*$ but accepted for $\mathbf{x}^* + \gamma\mathbf{g}$, it is implied that the test for $H_1$ is inadmissible.
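The numerical example quoted above ($a = 2$, $b = 4$, $\Delta = 1$, $\varepsilon = 0.1$, $\rho = 0.5$, $\gamma = 0.2$) can be checked directly. The sketch below is an illustration rather than part of the original argument: it evaluates the quadratic forms through the closed form (4.2), using hypothetical helper names and 0-based indices, and confirms that the stage 2 statistic at $\mathbf{x}^*$ exceeds the stage 2 statistic at $\mathbf{x}^* + \gamma\mathbf{g}$.

```python
def quad_form(x, rho):
    """x' Sigma^{-1} x for an intraclass Sigma (unit diagonal, off-diagonal rho), via (4.2)."""
    m = len(x)
    g_const = rho / (1.0 + (m - 1) * rho)
    return (sum(v * v for v in x) - g_const * sum(x) ** 2) / (1.0 - rho)

def drop_largest(x):
    """Remove the coordinate with the largest absolute value; return (index, reduced vector)."""
    j = max(range(len(x)), key=lambda i: abs(x[i]))
    return j, [v for i, v in enumerate(x) if i != j]

a, b, delta, eps, rho = 2.0, 4.0, 1.0, 0.1, 0.5
gamma = eps / rho
x_star = [a, -a - delta, b, -b - eps]
g = [1.0, rho, rho, rho]                        # first column of Sigma
x_shift = [v + gamma * gi for v, gi in zip(x_star, g)]

j_star, rest_star = drop_largest(x_star)        # coordinate eliminated first at x*
j_shift, rest_shift = drop_largest(x_shift)     # coordinate eliminated first at x* + gamma g
stat_star = quad_form(rest_star, rho)           # stage 2 statistic, as in (4.4)
stat_shift = quad_form(rest_shift, rho)         # stage 2 statistic, as in (4.5)
```

With these values the fourth coordinate is eliminated first at $\mathbf{x}^*$ and the third at $\mathbf{x}^* + \gamma\mathbf{g}$; the two stage 2 statistics are 53.5 and 47.455, so setting $C_2$ equal to the former makes the procedure stop at stage 2 at the shifted point.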

For $M = 5$, it can be shown that if the critical values correspond to critical values of chi-square with $m$ degrees of freedom, $m = 1, 2, 3, 4, 5$, at level, say, 0.05, then, for most values of $\rho$, LRSD is also inadmissible. The same is true for any $M > 5$.

Next, for the intraclass model, we consider testing one-sided alternatives (i.e., we test $H_i: \mu_i = 0$ vs. $K_i^*: \mu_i > 0$). The LRSD method in this case is the same as in the two-sided alternative case, except that $|x_{j_1}|$ is replaced by $x_{j_1} = \max(x_1, \ldots, x_M)$ and similarly at subsequent stages. For this setup we have the following theorem.

Theorem 4.2. For the one-sided alternative case, LRSD is admissible. 

Proof. See Appendix. □ 

The final result of this section deals with the intraclass model when the covariance matrix is of the form $\sigma^2\Sigma$, where $\sigma^2$ is unknown and $\Sigma$ is as before. This time, however, a random sample $\mathbf{X}_\alpha$, $\alpha = 1, \ldots, n$, is taken from a normal distribution with mean vector $\boldsymbol{\mu}$ and covariance matrix $\sigma^2\Sigma$. The alternative hypotheses are one-sided, and the global likelihood ratio test is based on $\bar{\mathbf{X}} = \sum_{\alpha=1}^{n} \mathbf{X}_\alpha/n$ and $T = \sum_{\alpha=1}^{n} \mathbf{X}_\alpha'\Sigma^{-1}\mathbf{X}_\alpha$. Using the fact that $\bar{\mathbf{X}}, T$ have an exponential family distribution and arguments similar to those used previously, it can be shown that the LRSD procedure is admissible in this case as well.

We remark that this model ensues for the problem of testing M treatments 
against a control when it is assumed that the mean for each treatment is 
greater than or equal to the mean for the control. More details are given in 
Section 6. 

5. Geometric and other interpretations. LRSD compares statistics of the form $\mathbf{x}'\Sigma^{-1}\mathbf{x}$ to critical values in order to test global hypotheses at each stage of the process. The overall acceptance region for the global testing problem is therefore an ellipsoid. The individual statistics $U_{mj}$ given in (2.4) determine the MRD method. These statistics represent pairs of supporting hyperplanes to the ellipsoids determining acceptance regions of the global hypotheses at stage $m$ [see Scheffé (1959), page 69]. The particular hyperplanes are tangent to the ellipsoids at sample points on the ellipsoid for which all coordinates but one are zero. If the acceptance probability under the global null hypothesis that the mean vector is zero is specified at, say, $\gamma$, then the probability of the ellipsoid is $\gamma$. The acceptance set determined by the supporting hyperplanes would then have probability larger than $\gamma$. However, should one desire this set to have probability $\gamma$, then the hyperplanes would support a smaller ellipse. Figures 1 and 2 depict such sets for the stage when there are only two means left to test.

Note that, by comparing $\max_j |U_{mj}|$ to a critical constant, one is determining an acceptance region for the global hypothesis at stage $m$ by using the union-intersection procedure [see, e.g., Casella and Berger (2002), page 380].

The statistics $U_{mj}$ appear in the identity in Anderson (1984), Exercise 54, Chapter 2. Thus, one can express the MRD method alternatively in terms of quadratic forms $\mathbf{x}^{(j)\prime}\Sigma_{(j)}^{-1}\mathbf{x}^{(j)}$.

The statistics $U_{mj}$ are also the focal point in determining change points in the methodology offered by Vostrikova (1981).

Although MRD uses $U_{mj}$ as does Vostrikova (1981), the methodologies are different. We discuss this further in Section 6.

Remark 5.1. It is interesting to note that the MRD method is not 
P-value monotone in the sense of Hommel and Bernhard (1999). That is, 
an MTP is monotone with respect to P-values if Pi < Pj and Hi is not 
rejected, then Hj cannot be rejected. As indicated in that reference, P-value 
monotonicity is not always desirable. 




Fig. 1. LRSD ellipse with supporting hyperplanes in two dimensions.

Fig. 2. LRSD ellipse with supporting hyperplanes shrunk to match size.

Remark 5.2. Use of $U_{mj}$ converted to P-values can be thought of as using conditional P-values at stage $m$ for a centered variable $j$, conditioned on the other remaining variables, assuming all nulls are true.

6. Treatments vs. control, change point and successive correlation models. The first two models of this section entail independent random samples from $(M + 1)$ normal populations. Let $Z_{ij}$, $i = 1, \ldots, M + 1$, $j = 1, \ldots, n$, be $N(\nu_i, \sigma^2)$. In the treatments vs. control model, the treatments correspond to $i = 1, \ldots, M$, while the control population corresponds to the $(M + 1)$st population. Let $X_i = \bar{Z}_i - \bar{Z}_{M+1}$, $i = 1, \ldots, M$, so that $\mathbf{X}$ is distributed as multivariate normal with mean vector $\boldsymbol{\mu}$, $\mu_i = \nu_i - \nu_{M+1}$, and covariance matrix $(2\sigma^2/n)\Sigma$, where $\Sigma$ is intraclass with diagonal elements 1 and off-diagonal elements 1/2. Should $\sigma^2$ be unknown, then an unbiased estimator of $\sigma^2$ is

$$s^2 = \sum_{i=1}^{M+1}\sum_{j=1}^{n} (Z_{ij} - \bar{Z}_i)^2 \big/ (M + 1)(n - 1).$$

Furthermore, $(M + 1)(n - 1)s^2/\sigma^2$ is distributed as chi-square with $(M + 1)(n - 1)$ degrees of freedom and is independent of $\bar{\mathbf{Z}}$ and hence $\mathbf{X}$. We recognize that, in terms of $\mathbf{X}$ and $T$, we have a special case of the model of Section 2 and, in fact, we have one of the models of Section 4. For this problem then, MRD is an admissible procedure for two-sided as well as one-sided alternatives. The LRSD procedure is admissible for one-sided alternatives and for two-sided alternatives for $M = 2$ and 3. For $M = 4$, however, many counterexamples to admissibility exist. In the next section, we use simulations to evaluate MRD and compare it to the popular stepwise procedures that are based on P-values from marginal distributions.
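The intraclass structure with correlation 1/2 follows from writing each $X_i$ as a contrast between a treatment mean and the control mean. As a small illustrative check (the function name is hypothetical, and the common factor $\sigma^2/n$ is set to 1), the code below forms the $M \times (M+1)$ contrast matrix $A = [I \mid -\mathbf{1}]$ and computes $AA'$, which is the covariance matrix of $\mathbf{X}$ up to that factor.

```python
def contrast_covariance(m):
    """A A' for the m x (m+1) contrast matrix A = [I | -1] (treatment mean minus control mean)."""
    a = [[1.0 if j == i else (-1.0 if j == m else 0.0) for j in range(m + 1)]
         for i in range(m)]
    return [[sum(a[i][k] * a[j][k] for k in range(m + 1)) for j in range(m)]
            for i in range(m)]

cov = contrast_covariance(4)                      # equals I + 11': 2 on the diagonal, 1 elsewhere
sigma = [[v / 2.0 for v in row] for row in cov]   # factor out 2: Cov(X) = (2 sigma^2 / n) Sigma
```

After factoring out 2, every diagonal entry of `sigma` is 1 and every off-diagonal entry is 0.5, matching the intraclass matrix described above.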

The model for the change point problem also entails $M + 1$ independent random samples from normal populations. Let $Z_{ij}$ be as in the previous setup, only this time we are interested in $M$ null hypotheses $H_i: \mu_i = \nu_{i+1} - \nu_i = 0$ vs. $K_i: \mu_i \ne 0$ for two-sided alternatives or $K_i^*: \mu_i > 0$ for one-sided alternatives, $i = 1, \ldots, M$. Let $X_i = \bar{Z}_{i+1} - \bar{Z}_i$, so that $\mathbf{X}$ is distributed as multivariate normal with mean vector $\boldsymbol{\mu}$ and covariance matrix $(\sigma^2/n)\Sigma$, where $\Sigma = (\sigma_{ij})$, and

$$\sigma_{ii} = 2, \qquad \sigma_{ij} = -1 \ \text{if } |i - j| = 1, \qquad \sigma_{ij} = 0 \ \text{otherwise}, \qquad i, j = 1, \ldots, M,\ i \ne j. \tag{6.1}$$

Note that a rejected $H_i$ amounts to inferring that a change in mean has occurred from time $i$ to time $(i + 1)$. One seeks to identify all change points. There is a substantial literature on the change point problem [see, e.g., Chen and Gupta (2000), where reference is made to the binary segmentation procedure (BSP) due to Vostrikova (1981)].

For this problem, one can consider a number of approaches. Among them are MRD, LRSD, BSP and the usual step-up and step-down procedures based on P-values. There is a very interesting connection between the MRD and BSP methods. Both are based on the $U_{mi}$ statistics given in (2.4). This is further support for our general methodology since, in this special case, our statistics are precisely the statistics $V_{mi}$ used by Vostrikova (1981) for the change point problem. MRD and BSP use the statistics differently.

We now demonstrate that the $U_{mi}$ are the same as the $V_{mi}$ and note that 
the $U_{mi}$ statistics can be computed readily for a problem of any size, since 
it is not necessary to actually invert any matrix or submatrix of $\Sigma$ as 
given in (6.1). Toward this end, for $1 \le p \le M$, define the $p \times p$ matrix 

(6.2)
$$\Sigma(p) = \begin{pmatrix}
2 & -1 & 0 & \cdots & 0 \\
-1 & 2 & -1 & \cdots & 0 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
0 & \cdots & -1 & 2 & -1 \\
0 & \cdots & 0 & -1 & 2
\end{pmatrix}.$$

Note that $\Sigma(M)$ was given in (6.1). It is easily verified that the first and 
last rows of $\Sigma^{-1}(p)$ are $(1/(p+1))(p, p-1, \ldots, 1)$ and 
$(1/(p+1))(1, 2, \ldots, p)$, respectively. 
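
The two rows just stated can be checked without inverting anything. The following Python sketch is illustrative only (the helper names are ours and are not part of the original development): it verifies, in exact rational arithmetic, that each claimed row multiplied by $\Sigma(p)$ returns the corresponding unit vector, which is equivalent to the row being a row of $\Sigma^{-1}(p)$.

```python
from fractions import Fraction


def sigma(p):
    """Tridiagonal p x p matrix of (6.2): 2 on the diagonal, -1 beside it."""
    return [[2 if i == j else -1 if abs(i - j) == 1 else 0
             for j in range(p)] for i in range(p)]


def first_row_inverse(p):
    """Claimed first row of the inverse: (p, p-1, ..., 1) / (p+1)."""
    return [Fraction(p - k, p + 1) for k in range(p)]


def last_row_inverse(p):
    """Claimed last row of the inverse: (1, 2, ..., p) / (p+1)."""
    return [Fraction(k + 1, p + 1) for k in range(p)]


def row_times_matrix(row, mat):
    """Row vector times square matrix."""
    n = len(mat)
    return [sum(row[k] * mat[k][j] for k in range(n)) for j in range(n)]


def rows_are_correct(p):
    """True if the claimed rows map Sigma(p) to the first/last unit vectors."""
    s = sigma(p)
    e_first = [1] + [0] * (p - 1)
    e_last = [0] * (p - 1) + [1]
    return (row_times_matrix(first_row_inverse(p), s) == e_first
            and row_times_matrix(last_row_inverse(p), s) == e_last)
```

Calling `rows_are_correct(p)` for a range of `p` exercises both rows; exact fractions are used so that the comparison with the unit vectors needs no tolerance.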

Suppose now that, at stage $m$, we have not eliminated $x_i$ (i.e., not 
rejected $H_i$ at an earlier stage) but have eliminated $x_{j_1}, \ldots, x_{j_{m-1}}$ 
(i.e., rejected $H_{j_1}, \ldots, H_{j_{m-1}}$). For this development, we may take 
$j_1 < j_2 < \cdots < j_{m-1}$ without loss of generality. Let $r$ be an index, 
$0 \le r < m$, and let $j_0 = 1$, $j_m = M$. We are now ready to prove the 
following theorem. 

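Theorem 6.1. For $j_r < i < j_{r+1}$, the statistics $U_{mi}$ of (2.4) can be 
expressed as 

(6.3)
$$U_{mi} = \left[\frac{j_{r+1} - j_r}{(j_{r+1} - i)(i - j_r)}\right]^{1/2}
\left[\sum_{j = j_r + 1}^{i} \bar{Z}_j - \frac{i - j_r}{j_{r+1} - j_r} \sum_{j = j_r + 1}^{j_{r+1}} \bar{Z}_j\right].$$
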

Proof. See Appendix. □ 
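
The practical content of Theorem 6.1 is that $U_{mi}$ requires only partial sums of the sample means between the two nearest eliminated indices. The Python sketch below is illustrative and rests on stated assumptions: $\sigma^2/n$ is taken to be 1; $U_{mi}$ is taken to be the residual of $x_i$ on the non-eliminated coordinates divided by its conditional standard deviation, as the Appendix describes; $x_j = Z_j - Z_{j+1}$ as in the Appendix; and the closed form is the one read off (A.10) and (A.11). All function names are hypothetical. The index $i$ is meant to lie strictly between two eliminated indices, so boundary conventions play no role.

```python
import math
import random


def solve(a, b):
    """Solve a y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[k]] for k, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * y[c] for c in range(r + 1, n))
        y[r] = (aug[r][n] - tail) / aug[r][r]
    return y


def cov(i, j):
    """Entry of the matrix in (6.1), with sigma^2 / n assumed equal to 1."""
    return 2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0


def u_direct(x, i, eliminated):
    """Residual of x[i] on the kept coordinates over its conditional s.d."""
    M = len(x) - 1                      # x[1..M] are used; x[0] is padding
    kept = [k for k in range(1, M + 1) if k != i and k not in eliminated]
    a = [[cov(p, q) for q in kept] for p in kept]
    c = [cov(i, q) for q in kept]
    w = solve(a, c)                     # w = Sigma_kept^{-1} c
    num = x[i] - sum(wk * x[k] for wk, k in zip(w, kept))
    var = cov(i, i) - sum(wk * ck for wk, ck in zip(w, c))
    return num / math.sqrt(var)


def u_closed(z, i, jr, jr1):
    """Closed form from (A.10)/(A.11) for eliminated jr < i < jr1."""
    scale = math.sqrt((jr1 - jr) / ((jr1 - i) * (i - jr)))
    left = sum(z[j] for j in range(jr + 1, i + 1))
    whole = sum(z[j] for j in range(jr + 1, jr1 + 1))
    return scale * (left - (i - jr) / (jr1 - jr) * whole)
```

For example, with $M = 8$, eliminated indices 2 and 7, and any $i$ in $\{3, 4, 5, 6\}$, the two functions agree up to rounding, while `u_closed` never forms or solves a linear system.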

Remark 6.1. It can be shown that the BSP procedure is also admissible. 

Remark 6.2. For the change point model when M > 4, it can be demon- 
strated that the LRSD method is frequently inadmissible both for two-sided 
and one-sided alternatives. 

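The successive correlation model starts with an $M \times 1$ random vector $X$, 
which is multivariate normal with mean vector $\mu$ and covariance matrix 

(6.4)
$$\Sigma(M) = \begin{pmatrix}
1 & \rho & 0 & \cdots & 0 \\
\rho & 1 & \rho & \cdots & 0 \\
0 & \rho & 1 & \ddots & \vdots \\
\vdots & \ddots & \ddots & 1 & \rho \\
0 & \cdots & 0 & \rho & 1
\end{pmatrix}.$$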


Note that if $|\Sigma(0)| = 1$, then for $r = 2, \ldots, M$, 

(6.5)
$$|\Sigma(r)| = |\Sigma(r-1)| - \rho^2\,|\Sigma(r-2)|.$$
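
The determinants of the tridiagonal matrices in (6.4) satisfy the usual three-term recursion $|\Sigma(r)| = |\Sigma(r-1)| - \rho^2 |\Sigma(r-2)|$ with $|\Sigma(0)| = |\Sigma(1)| = 1$. A minimal Python sketch of that recursion, checked against a determinant computed by exact elimination, follows. The function names are ours, and $\rho$ is passed as a `Fraction` so that the comparison is exact.

```python
from fractions import Fraction


def sigma(r, rho):
    """r x r matrix of (6.4): 1 on the diagonal, rho beside it."""
    return [[Fraction(1) if i == j else rho if abs(i - j) == 1 else Fraction(0)
             for j in range(r)] for i in range(r)]


def det_by_elimination(mat):
    """Determinant by Gaussian elimination in exact arithmetic."""
    a = [row[:] for row in mat]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        piv = next((r for r in range(col, n) if a[r][col] != 0), None)
        if piv is None:
            return Fraction(0)          # singular matrix
        if piv != col:
            a[col], a[piv] = a[piv], a[col]
            det = -det                  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
    return det


def det_by_recursion(r, rho):
    """|Sigma(r)| from the three-term recursion, O(r) operations."""
    prev2, prev1 = Fraction(1), Fraction(1)   # |Sigma(0)|, |Sigma(1)|
    for _ in range(2, r + 1):
        prev2, prev1 = prev1, prev1 - rho * rho * prev2
    return prev1
```

The recursion needs only the two previous determinants, which is what makes the quantities in this model cheap to compute for large $M$.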



Also, one can verify that the first row of the inverse of $\Sigma(r)$ is 
$(d_{1,r}, \ldots, d_{r,r})$, where 

(6.6)
$$d_{i,r} = (-\rho)^{i-1}\,|\Sigma(r-i)| / |\Sigma(r)|.$$

By symmetry, the last row is $(d_{r,r}, \ldots, d_{1,r})$. Proceeding as in the 
change point model case when $x_{j_1}, \ldots, x_{j_{m-1}}$ have been eliminated and 
$j_r < i < j_{r+1}$, the numerator of $U_{mi}$ [see (A.7)] is 

(6.7)
$$x_i - (0, \ldots, 0, \rho, \rho, 0, \ldots, 0)\,
\operatorname{diag}\bigl(\Sigma(j_1 - 1), \ldots, \Sigma(s_1), \Sigma(s_2), \ldots, \Sigma(M - j_{m-1})\bigr)^{-1}
x^{(i, j_1, \ldots, j_{m-1})},$$

where the row vector above is of order $(M - m) \times 1$ and has two entries of 
$\rho$ in positions $i - 1$ and $i$. If $i = 1$ or $M$, then there is only one entry of $\rho$. 
Defining $s_1$ and $s_2$ as in the change point model, we find that (6.7) is 

(6.8) X, + (0, . . . , 0, d,,s, , . . . , , ds,,s2 0, . . . , 0)x(*'^'i'-'J™-i), 

where the nonzero entries in (6.8) appear in positions jV + 1, jV+i — 2. Thus, 

(6.8) becomes 

Sl S2 

(6.9) Xj + ^ ^ ^j)Sl-^jr+j ~l~ ^ y ds2~j + l,S2'^i+j- 

i=i 

The denominator of $U_{mi}$ is 

(6.10)
$$1 - (0, \ldots, 0, \rho, \rho, 0, \ldots, 0)\,
\operatorname{diag}\bigl(\Sigma(j_1 - 1), \ldots, \Sigma(s_1), \Sigma(s_2), \ldots, \Sigma(M - j_{m-1})\bigr)^{-1}
(0, \ldots, 0, \rho, \rho, 0, \ldots, 0)',$$

where the vectors are $(M - m)$-dimensional with two entries of $\rho$ in the 
$(i - 1)$ and $i$ positions. If $i = 1$ or $M$, then there is only one entry of $\rho$. Thus, 
only the $(s_1, s_1)$ element of $\Sigma^{-1}(s_1)$ and the $(1,1)$ element of $\Sigma^{-1}(s_2)$ will 
be needed. Specifically, we get 

(6.11)
$$1 - \rho^2\,\frac{|\Sigma(s_1 - 1)|}{|\Sigma(s_1)|} - \rho^2\,\frac{|\Sigma(s_2 - 1)|}{|\Sigma(s_2)|}.$$

7. Simulations. The MRD procedure can be viewed as a family of admissible 
procedures parametrized by a set of constants $\{C_1, \ldots, C_M\}$. It can 
be shown, using an inequality due to Šidák (1968), that $\{C_1, \ldots, C_M\}$ can 
be chosen so that the MRD procedure controls the strong FWER. However, 
such a choice of C's would be extremely conservative and would sacrifice the 
gains achieved by MRD, which takes advantage of the correlation among the 
variables. It is also possible to choose smaller C's to control FDR. However, 
this too is likely to lead to an overly conservative procedure. To determine a 
reasonable set of constants, one must study the risks (errors and error rates) 
for various choices of constants. As is the case in a typical decision theory 
problem where no optimal procedure exists, one must choose from a number 
of admissible procedures. This process needs to be done prior to looking at 
the data. To make this choice in practice, one must consider the particular 
application. In the examples we present, the number of hypotheses is very 
large, and so one expects only a small or modest percentage of alternatives 
to be true. Thus, we focused on that portion of the parameter space where, 
at most, 25% of alternatives were true. A large variety of sets of constants 
were evaluated through simulation. Those presented gave a good balance 
of performance in terms of expected numbers of type I and type II errors 
committed. 

We have seen, in Section 3, that the MRD procedures possess the intuitive 
convexity property needed for admissibility regardless of the covariance 
matrix, $\sigma^2\Sigma$. These stepwise procedures make extensive use of the 
covariance structure at every stage. To see the types of improvements that can be 
made over usual stepwise methods, we now present some simulation studies. 
In this section, we report results for the treatments versus control model in 
both the $\sigma^2$ known and unknown cases. We also look at the change point 
model. 

Our studies focus on the situation where the number of populations is 
large and the proportion of true alternatives is less than 25%. For two-sided 
alternatives, we present a comparison of the MRD method with either the 
step-up or step-down method (whichever did better in the given situation). 
The step-up and step-down methods used in the comparison are those based 
on P-values determined from marginal distributions. We report the expected 
number of type I errors, the expected number of type II errors and the 
FDR. To obtain the probabilities of type I and type II errors, we can divide 
the expected numbers of errors in the tables below by the number of true nulls 
and alternatives, respectively. For all simulations, we used 1000 iterations. 
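
The conversion from expected error counts to error probabilities is a single division per error type. As a small illustration (the function name is ours), the Python sketch below applies it to the MRD entries of the Table 1 row with 9200 true nulls and 800 true alternatives whose expected numbers of type I and type II errors are 13.02 and 560.32.

```python
def per_hypothesis_error_rates(exp_type1, exp_type2, n_true_nulls, n_true_alts):
    """Convert expected error counts into per-hypothesis error probabilities."""
    # Guard against division by zero when there are no nulls or no alternatives.
    type1_rate = exp_type1 / n_true_nulls if n_true_nulls else 0.0
    type2_rate = exp_type2 / n_true_alts if n_true_alts else 0.0
    return type1_rate, type2_rate


# MRD entries of one Table 1 row: 9200 true nulls, 800 true alternatives.
mrd_rates = per_hypothesis_error_rates(13.02, 560.32, 9200, 800)
```

Here the type I probability is 13.02/9200 and the type II probability is 560.32/800, so in this configuration roughly 70% of the true alternatives are missed by MRD even though almost no true nulls are rejected.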

Table 1 gives the results for the treatments versus control model (i.e., 
intraclass with a correlation coefficient of 0.5) with a known $\sigma^2 = 1$ for 
two-sided alternatives. The step-up procedure in the table is the 
Benjamini-Hochberg (1995) FDR controlling procedure where FDR = 0.05. Thus, the 
critical values for the step-up procedure are $|\Phi^{-1}(0.05\,i/2M)|$. The critical 
values for MRD are somewhat related to the FWER controlling step-down 
procedure where the control is at level 0.05. Specifically, these critical values 
for MRD are as follows. For $\alpha = 0.05$ and $M = 10{,}000$, $C_1 = \Phi^{-1}(1 - \alpha/2M)$ 
and $C_i = 0.71\,\Phi^{-1}(1 - \alpha/2(M - i + 1))$ for $1 < i \le M$. These critical values were 
selected by trial and error using simulations with 1000 iterations. They were 
chosen so that a desirable procedure would ensue and also to suggest a way to 
get critical values in other cases. Another consideration was to try to match 
step-up in FDR when the number of nonnulls is large. Here, $M = 10{,}000$, and 
the results are most dramatic. There is improvement (usually substantial) 
in the expected numbers of both type I and type II errors. 
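
The two families of critical values just described can be tabulated directly. The Python sketch below is illustrative: the function names are ours, the standard normal quantile is taken from the standard library, and $\alpha/2(M - i + 1)$ is read as $\alpha/(2(M - i + 1))$ and $0.05\,i/2M$ as $0.05\,i/(2M)$, which is an assumption about the intended grouping.

```python
from statistics import NormalDist

# Quantile function of the standard normal distribution.
PHI_INV = NormalDist().inv_cdf


def mrd_critical_values(M, alpha=0.05, shrink=0.71):
    """C_1 = Phi^{-1}(1 - alpha/(2M)); C_i = shrink * Phi^{-1}(1 - alpha/(2(M-i+1)))."""
    c = [PHI_INV(1 - alpha / (2 * M))]
    c += [shrink * PHI_INV(1 - alpha / (2 * (M - i + 1)))
          for i in range(2, M + 1)]
    return c


def step_up_critical_values(M, q=0.05):
    """Two-sided step-up constants |Phi^{-1}(q i / (2M))|, i = 1, ..., M."""
    return [abs(PHI_INV(q * i / (2 * M))) for i in range(1, M + 1)]
```

The `shrink` argument is the only constant that changes between the two-sided studies reported here: 0.71 for Table 1, 0.63 for Table 2 and 0.77 for Table 3.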

Table 2 also gives results for the treatments versus control model (i.e., 
intraclass with a correlation coefficient of 0.5) but with unknown $\sigma^2$ for 
two-sided alternatives. Here, $M = 3000$, $n = 10$, $\alpha = 0.05$, $C_1 = \Phi^{-1}(1 - \alpha/2M)$ 
and $C_i = 0.63\,\Phi^{-1}(1 - \alpha/2(M - i + 1))$ for $1 < i \le M$. The step-down procedure 
in the table is based on P-values of the marginal distributions of t-statistics 
with a pooled estimate of $\sigma^2$. The step-down procedure controls FWER at 
$\alpha = 0.05$. 

Table 3 deals with the change point model for two-sided alternatives. 
Unlike the intraclass model, the variables are not exchangeable. Thus, the 
pattern of true mean values as well as the choice of true mean values 
impacts the operating characteristics of the procedures. It would be difficult 
to select a particular portion of the parameter space to study without 
knowing the specific application. The type of pattern in mean values 
we present reflects the notion of an occasional rise in mean value as follows. 
The sequence of differences in consecutive means is of the form 
0, . . . , 0, 1, 1, 1, 0, . . . , 0, 1, 1, 1, 0, . . . , where the triples (1,1,1) are 
equally spaced. Once again, $M = 3000$, $\alpha = 0.05$, $C_1 = \Phi^{-1}(1 - \alpha/2M)$ and 
$C_i = 0.77\,\Phi^{-1}(1 - \alpha/2(M - i + 1))$. The step-down procedure in the table 
is based on the difference of two normal variables, each with variance 1. The 
procedure controls FWER at $\alpha = 0.05$. 
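
To make the mean pattern concrete, the Python sketch below builds a sequence of $M$ differences containing a given number of equally spaced triples (1,1,1) and draws one realization of the $X_i$ as differences of unit-variance normal variables. It is only a sketch under assumptions: the exact placement of the triples is not specified in the text, so centring each triple in a block of equal length is our choice, and the function names are ours.

```python
import random


def triple_pattern(M, n_triples):
    """Differences in consecutive means: zeros with equally spaced (1,1,1) triples."""
    diffs = [0] * M
    if n_triples == 0:
        return diffs
    gap = M // n_triples                    # assumed length of each block
    for k in range(n_triples):
        start = k * gap + (gap - 3) // 2    # centre the triple in its block
        for t in range(3):
            diffs[start + t] = 1
    return diffs


def simulate_differences(diffs, rng):
    """Draw X_i = Z_{i+1} - Z_i, where the Z's are normal with variance 1."""
    means = [0.0]
    for d in diffs:
        means.append(means[-1] + d)         # cumulative sum gives the Z means
    z = [rng.gauss(mu, 1.0) for mu in means]
    return [z[i + 1] - z[i] for i in range(len(diffs))]
```

With $M = 3000$ and 10 triples this gives 30 nonzero differences and 2970 nulls, matching the first nontrivial row of Table 3.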

The message in Tables 2 and 3 for two-sided alternatives is that MRD has 
a slightly higher expected number of type I errors but has many fewer type II 
errors. 

Table 4 gives results for the treatments versus control model with known 
$\sigma^2 = 1$, for one-sided alternatives. We compare MRD, LRSD, step-down 
based on Dunnett's tests for $\alpha = 0.05$, call it D(0.05), step-down based on 
Dunnett's tests for $\alpha = 0.2$, call it D(0.2), regular step-down (SD) and 
regular step-up (SU). Before commenting on the simulation findings, we make 
some remarks. MRD and LRSD both take dependency into account in two 
ways, namely, through test statistics and through critical values. D(0.05) 
and D(0.2) take dependency into account only through critical values; SD 
and SU do not take dependency into account at all. Recall that our 
proposal is to sacrifice some FDR control, especially when there are not too 



Table 1 

Comparison of MRD and SU procedures for treatments vs. control, variance known 

                                  Expected # of    Expected # of
   Number of means equal to       type I errors    type II errors        FDR         Total errors
      0    -4    -2     2     4      MRD      SU       MRD       SU     MRD    SU       MRD       SU
  10000     0     0     0     0     0.67      28         0        0    0.05  0.02      0.67       28
   9200     0   800     0     0    13.02   24.03    560.32    726.5    0.05  0.02    573.34   750.52
   9200   800     0     0     0    12.23   58.77      5.99   131.18    0.02  0.04     18.22   189.96
   8400     0   800   800     0     11.2   40.32   1041.91  1463.22    0.02  0.03   1053.11  1503.54
   8400     0     0  1600     0    16.06   43.45   1205.59  1392.09    0.04  0.02   1221.65  1435.54
   8400   800     0   800     0    12.78   55.09    557.82   730.51    0.01  0.03    570.60    785.6
   8400     0     0   800   800    12.95   34.40    563.96   752.64    0.01  0.03    576.91   787.04
   8400   800     0     0   800    13.28   73.65     12.37   148.81    0.01  0.04     25.65   222.45
   8400     0     0     0  1600    13.46   70.82     12.56   167.88    0.01  0.04     26.02    238.7
   7600     0   800  1600     0    12.17   55.13   1602.92  2121.25    0.02  0.03   1614.47  2176.37
   7600     0     0  2400     0    24.95   59.77   1943.43   2000.7    0.05  0.03   1968.37  2060.47
   7600   800     0  1600     0    16.17   57.67   1191.87  1313.02    0.01  0.03   1208.05   1370.7
   7600     0     0  1600   800    16.41   58.33   1202.26  1326.52    0.01  0.03   1218.66  1384.85
   7600   800     0   800   800    14.32   85.26    562.51   718.44    0.01  0.03    576.83    803.7
   7600     0     0   800  1600    14.56   69.92    569.23   758.13    0.01  0.04     583.8   828.05
   7600   800     0     0  1600    14.73   95.19     19.79   160.22    0.01  0.03     34.52    255.4
   7600     0     0     0  2400    15.58  116.56     21.17   218.25    0.01  0.04     36.76   334.82

Table 2 

Comparison of MRD and SD procedures for treatments vs. control, variance unknown 

   Number of noncentrality        Expected # of    Expected # of
   parameters equal to the value  type I errors    type II errors        FDR         Total errors
      0    -3    -1     1     3      MRD      SD       MRD       SD     MRD    SD       MRD       SD
   3000     0     0     0     0     0.18    0.09         0        0    0.02  0.02      0.18     0.09
   2800     0   200     0     0     1.22    0.04    198.62   199.93    0.05  0.02    199.85   199.96
   2800   200     0     0     0     7.33    0.07      22.7    180.9    0.04  0.01     30.03   180.97
   2600     0   200   200     0     2.51    0.04    392.62   399.78    0.07  0.01    395.13   399.82
   2600     0     0   400     0      1.7    0.04    396.81   399.79    0.05  0.01    398.51   399.83
   2600   200     0   200     0      6.5    0.03     211.3   379.14    0.03     0    217.81   379.17
   2600     0     0   200   200     7.02    0.04    218.62   381.31    0.04  0.01    225.64   381.35
   2600   200     0     0   200     4.31    0.07     58.43   361.18    0.01     0     62.75   361.25
   2600     0     0     0   400     4.92    0.04     59.52   362.74    0.01  0.01     64.43   362.78
   2400     0   200   400     0     2.82    0.05    587.53   599.69    0.06  0.01    590.35   599.74
   2400     0     0   600     0     1.74    0.05    596.39   599.66    0.05  0.01    598.14   599.71
   2400   200     0   400     0      6.2    0.02    403.79    580.5    0.03     0    409.99   580.52
   2400     0     0   400   200     7.07    0.02    417.96   581.49    0.04  0.01    425.03   581.51
   2400   200     0   200   200     3.88    0.03    254.02    562.3    0.01     0     257.9   562.32
   2400     0     0   200   400     5.01    0.02    262.95   562.79    0.02  0.01    267.96   562.81
   2400   200     0     0   400     2.91    0.05    110.07   541.56    0.01     0    112.98   541.61
   2400     0     0     0   600     3.96    0.03    110.39   543.14    0.01  0.01    114.36   543.17

Table 3 

Comparison of MRD and SD procedures for the change point model 

                    Expected # of    Expected # of
     Number of      type I errors    type II errors        FDR         Total errors
  nulls  triples      MRD      SD       MRD       SD     MRD    SD       MRD       SD
   3000        0        0    0.05         0        0       0  0.05         0     0.05
   2970       10     1.81    0.04     21.14       30    0.16  0.04     22.95    30.04
   2955       15     2.04    0.06     31.34    44.99    0.13  0.06     33.38    45.05
   2925       25     4.27    0.05     52.65    74.98    0.16  0.05     56.93    75.03
   2850       50     8.39    0.05    105.26   149.97    0.16  0.05    113.65   150.03
   2820       60     7.63    0.06    125.77   179.97    0.12  0.05    132.99   180.03
   2700      100    17.61    0.04    210.99   299.95    0.17  0.04     228.6   299.99
   2550      150    27.21    0.05    317.71   449.93    0.17  0.04    344.92   449.97
   2400      200    36.52    0.04    423.90   599.90    0.17  0.04    460.42   599.95

many rejections. MRD was recommended when the proportion of false nulls 
is less than 0.25. Also, recall the geometric relationship between MRD and 
LRSD, as noted in Section 5. In light of this, we expect and do observe 
that the performance of MRD and LRSD (in terms of expected number of 
mistakes) is comparable. One advantage that MRD has over LRSD is in 
computation. LRSD requires a package like "quadprog" in R. This program 
is very time consuming for M > 100, which is why the simulation is done 
for M = 100 and not for a larger M. Since D(0.05) takes dependency into 
account through critical values, that procedure should and does perform 
better than SD, which controls FWER at $\alpha = 0.05$. It is not fair in a sense 
to compare D(0.05), a markedly conservative procedure, with MRD and 
LRSD. However, one can compare D(0.2) with MRD and LRSD, and the 
latter two are preferred. 

The simulations are based on 1000 iterations. The largest percentage of 
true alternatives considered is 25. The critical values for MRD and LRSD 
are as follows. Let $C_1(SD) = \Phi^{-1}(1 - \alpha/M)$ and 
$C_i(SD) = \Phi^{-1}(1 - \alpha/(M - i + 1))$, $1 < i \le M$. Then $C_1$ for LRSD is 
$1.25\,C_1(SD)$ and $C_i$ for LRSD is $1.2\,C_i(SD)$, for $i \neq 1$. For MRD, 
$C_1 = C_1(SD)$ and $C_i = 0.7\,C_i(SD)$, for $i \neq 1$. 
The C's for D(0.2) and D(0.05) are obtained by simulation. 

Table 4 offers simulations that yield FDR and total errors for each of the 
six procedures. Other simulations yielded expected number of type I errors 
and expected number of type II errors. These are not given in the table, 
because the pattern for type I errors is the same as with FDR, and the 
pattern for type II errors can be discerned from the columns giving total 
errors. 



Table 4 

Comparison of MRD, LRSD, D(0.2), D(0.05), SD and SU for one-sided treatments vs. control 

  # of means
   equal to                          FDR                                           Total errors
    0    2    4     MRD    LRSD  D(0.2) D(0.05)      SD      SU     MRD    LRSD  D(0.2) D(0.05)      SD      SU
  100    0    0    0.05    0.04    0.19    0.05    0.03    0.04    0.11    0.19    0.95    0.08    0.07    0.63
   95    5    0     0.1    0.11    0.11    0.02    0.01    0.02    4.53    4.81    4.99    4.89    4.91    5.19
   95    0    5    0.17    0.06    0.08    0.02    0.01    0.04    1.39    1.54    2.46    3.11    3.33    3.40
   90    5    5    0.12    0.07    0.05    0.01    0.01    0.03    4.40    5.13    6.77    7.89    8.38    8.23
   90   10    0    0.09    0.14    0.09    0.02    0.01    0.04    8.28    8.12    9.18    9.65    9.73   10.06
   90    0   10     0.1    0.04    0.05    0.01    0.00    0.04    1.66    2.20    4.29    6.09    6.82    5.93
   85    5   10    0.07    0.04    0.04    0.01    0.00    0.04    4.53    5.51    8.24   10.58   11.51   10.02
   85   10    5    0.08    0.07    0.05    0.01    0.01    0.04    7.56    8.24   10.96   12.55   13.11   12.70
   80   15    5    0.06    0.06    0.04    0.01    0.00    0.04   10.86   11.21   14.81   17.34   17.86   16.75
   80    5   15    0.05    0.02    0.02    0.01    0.00    0.03    4.78    6.03   10.00   13.27   15.14   11.82
   80   10   10    0.06    0.04    0.03    0.00    0.00    0.03    7.85    8.66    12.7   15.50   16.55   14.63
   80   20    0    0.04    0.10    0.05    0.01    0.01    0.03   15.88   13.54   17.31   19.09   19.34   19.00
   80    0   20    0.05    0.01    0.02    0.00    0.00    0.03    2.06    3.34    7.50   11.66   13.76    9.09
   75    5   20    0.04    0.02    0.02    0.00    0.00    0.03    5.23    6.68   11.49   16.59   18.43   13.06
   75   20    5    0.04    0.06    0.03    0.01    0.00    0.03   14.51   14.17   18.95   22.10   22.73   21.06
   75   15   10    0.05    0.04    0.03    0.01    0.00    0.03   11.02   11.66   16.55   20.21   21.34   18.62
   75   10   15    0.05    0.03    0.02    0.01    0.00    0.03    8.13    9.26   14.01   18.26   19.82   15.81
   75   25    0    0.03    0.08    0.04    0.01    0.00    0.03   20.35   16.52   21.73   23.87   24.28   23.63
   75    0   25    0.05    0.01    0.02    0.00    0.00    0.03    2.50    3.88    9.14   14.34   17.12   10.59

In summary, under the stated conditions, the admissible procedures MRD 
and LRSD have comparable performances with a computational advantage 
for MRD, should M > 100. 

APPENDIX 

Proof of Lemma 3.2. For j 7^ 1, j / ji, ...J / use (2.4) and 

recall g is the first column of S to see that 

C/mj(x + rg) 

= {Xj + raji - (Tj 

= f/„j(x) + [ruji - rcr^.^^'-'^f'"-^)^ (1,0, . . • , 0)']/f^{i-ji,...,i(„_i)) 

— Umj (x) . 

This establishes (3.4). 
Now, 

;7mi(x + rg) 

= t/mi(x) + [rcjii 

(A.2) 

= ?7mi(x) +r, 
which establishes (3.5). □ 

Proof of Lemma 3.3. If (^[/(x*) = then, when x* is observed, the 
process must stop before Hi is rejected. Suppose it stops at stage m with- 
out having rejected Hi. That means that Umj^n < ^m, which is equivalent 
to \Umi\ < Cm for all i G {1, . . . , M} \ ji, . . . ,jm~i, ji / 1- Also Uij-^ > Q, 
i = 1, . . . ,171 — 1, ji ^ 1. Next, consider x* + rgg, which is a reject Hi point. 
By Lemma 3.2, (A.l) and (A.2) imply that only the functions ?7mi(x) 
can change from x* to x* + rog. Also, at some stage h < m from (A.2), 
Ufi^i must have increased to become positive and become the maximum 
function at that stage and also be > Ch- By (A.2), Uh^i will be at least 
this large for all r > tq. Thus, Hi will also be rejected for all x + rg, 
r > ro . □ 

Proof of Theorem 4.1. We prove the theorem for M = 3. For M = 2 
the method is the same and the proof is simpler. In light of Lemma 3.1 and 
the proof of Theorem 3.1, we need to show that the LRSD test for $H_1$ vs. 
$K_1$, say $\phi_1(x)$, as a function of $x + \gamma g$, goes from reject to accept to 
reject as $\gamma$ varies over $(-\infty, \infty)$. Another way of stating this requirement 
is as follows. Suppose $\phi_1(x^*) = 1$ when $x_1^* > 0$. Then, we must have 
$\phi_1(x^* + \gamma g) = 1$ for all $\gamma > 0$, while if $\phi_1(x^*) = 1$ and $x_1^* < 0$, we must 
have $\phi_1(x^* - \gamma g) = 1$ for all $\gamma > 0$. 

There are a number of cases that need to be treated, namely, each of the 
three stages at which $H_1$ is rejected. If $H_1$ is rejected at stage 1 at $x = x^*$ and 
$x_1^* > 0$, with $x_1^* > |x_2^*|$ and $x_1^* > |x_3^*|$, then $x^{*\prime}\Sigma^{-1}x^* > C_1$, and this 
implies that 

$$(x^* + \gamma g)'\Sigma^{-1}(x^* + \gamma g) = x^{*\prime}\Sigma^{-1}x^* + 2\gamma x_1^* + \gamma^2 > C_1.$$

Also, $x_1^* + \gamma > |x_2^* + \gamma\rho|$ and $x_1^* + \gamma > |x_3^* + \gamma\rho|$, which means that 
$\phi_1(x^*) = 1$ implies $\phi_1(x^* + \gamma g) = 1$ and $H_1$ is rejected at stage 1 for all 
$x^* + \gamma g$, all $\gamma > 0$. 

The next case to consider is when $H_1$ is rejected at the second stage for 
$x = x^*$. Two subcases are $x_1^* > 0$ and $x_1^* < 0$. For $x_1^* > 0$, suppose $x_3$ is out 
first. Then, we find that $x_1^{*2} + x_2^{*2} - 2\rho x_1^* x_2^* > C_2$ and 

(A.3)
$$(x_1^* + \gamma)^2 + (x_2^* + \gamma\rho)^2 - 2\rho(x_1^* + \gamma)(x_2^* + \rho\gamma)
= x_1^{*2} + x_2^{*2} - 2\rho x_1^* x_2^* + 2\gamma x_1^* + \gamma^2 + \rho^2\gamma^2 - 2\rho^2 x_1^*\gamma - 2\rho^2\gamma^2.$$

But, since $\gamma^2 + \rho^2\gamma^2 > 2\rho^2\gamma^2$ and $2\gamma x_1^* > 2\rho^2\gamma x_1^*$, it follows that (A.3) $> C_2$ 
for all $\gamma > 0$. Hence, $\phi_1(x^* + \gamma g) = 1$ for all $\gamma > 0$. If $x_1^* < 0$, a similar 
argument works for $(x_1^* - \gamma)^2 + (x_2^* - \gamma\rho)^2 - 2\rho(x_1^* - \gamma)(x_2^* - \gamma\rho)$. 

Finally, the third case is when $H_1$ is rejected at stage 3. In subcases where 
the ordering of the components of $x^*$ is maintained with $(x^* + \gamma g)$, it is easy 
to prove the required monotonicity property. The most challenging subcase 
is if $|x_3^*| > x_2^* > x_1^* > 0$ with $x_3^* < 0$ but 

(A.4)  $|x_3^* + \gamma\rho| < x_2^* + \gamma\rho$. 

In this case, when $\rho > 0$, we use the fact that $x_3^{*2} > x_2^{*2}$ and use inequalities, 
as in the previous case, to prove the result. When $\rho < 0$, we observe that, if 
$|x_3^*| > x_2^* > x_1^* > 0$ and $x_3^* < 0$, then $|x_3^* + \rho\gamma| > |x_2^* + \rho\gamma|$, and so (A.4) cannot 
happen. It is easy to verify then that if $\phi_1(x^*) = 1$ then $\phi_1(x^* + \gamma g) = 1$ for 
all $\gamma > 0$. Similarly, for $x_1^* < 0$. □ 

Proof of Theorem 4.2. Once again, we focus on Hi vs. and 
demonstrate that if 0i(x*) = 1, then (j)i{x* +7g) = 1 for all 7 > 0. Suppose 
Hi is rejected at stage m at x = x*. Then, x^^ > x*^ > • • • > > xl > 

X* ,,>•••> x*„ and xl > 0. Note that, at x** = x* + 7g, the orders of all 
coordinates are preserved except perhaps the first coordinate, which now can 
be anywhere among the m largest coordinates. The k stage global hypoth- 
esis is considered if Hj-,^, . . . , Hj^ -^ have been rejected. This global testing 
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problem is Hkc ■ fi^^^^-^^"-^^ = Qin^-^^k-i) yg. Kkc ■ /x^^i'-'Jfe-i) G qUu-Jk-i) ^ 
where Q(n,-,jk-i) = ^^>0,i £ [I, . . . , M]\[ji, . . . ,ifc_i]}\0(-''i'-J'^-i). 

The hkehhood ratio test rejects H^g if x* is observed and if 

_(i/2)/,(ii.-,A-i)'s,.; ^.^ 

(A.5) 

= exp{x*(^i'-'^^-i)'s-^ . a(n,-,Jk-i)* 
where p,^^^'---'^k~i)* ^^i^ maximum hkelihood estimator of ^ih,--;ik-i) ^ when 

X = X*. 

Next, consider the hkehhood ratio test statistic at x**. It is 

supexp{(x*(-^i'-'-''=-i) + 7gOi.-.ife-i))' 
Q 

UlvJfc-l)^ 

>exp{(x*(ji'-'J'=-i))' 

Recognize that the right-hand side of (A. 6) is the maximized hkelihood in 
(A.5) times exp7/iS^'''-'^'=-i^*. Since /jj^i'-'^'^-i)* > o, it follows from (A.5) 
and (A. 6) that (A. 6) > C^, which means that there is a rejection at stage k 
at X** if there was a rejection at stage A; at x*, A; = 1, . . . , M. Since the order 
of the coordinates of Xj-^, . . . ,Xj^_-^ remains unchanged and xl* is among 
the m largest coordinates of x**, it follows that Hi is rejected at stage m or 
sooner. □ 



(A.6) 



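Proof of Theorem 6.1. With $x_{j_1}, \ldots, x_{j_{m-1}}$ eliminated, the covariance 
matrix of the remaining variables is the block matrix 

$$\begin{pmatrix}
\Sigma(j_1 - 1) & & & \\
& \Sigma(j_2 - j_1 - 1) & & \\
& & \ddots & \\
& & & \Sigma(M - j_{m-1})
\end{pmatrix},$$

where $\Sigma(p)$ is given in (6.2). Now, let $s_1 = i - j_r - 1$ and $s_2 = j_{r+1} - i - 1$. 
Then, the residual for $x_i$ (i.e., the numerator of $U_{mi}$) is 

(A.7)
$$x_i - (0, \ldots, 0, -1, -1, 0, \ldots, 0)\,
\operatorname{diag}\bigl(\Sigma(j_1 - 1), \ldots, \Sigma(s_1), \Sigma(s_2), \ldots, \Sigma(M - j_{m-1})\bigr)^{-1}
x^{(i, j_1, \ldots, j_{m-1})},$$

where the row vector above is of order $(M - m) \times 1$ and has two entries of 
$-1$ in positions $(i - 1)$ and $i$. If $i = 1$ or $M$, then there is only one entry 
of $-1$. Thus, only the last row of $\Sigma^{-1}(s_1)$ and first row of $\Sigma^{-1}(s_2)$ will be 
needed. Specifically, (A.7) can be written as 

(A.8)
$$x_i + \Bigl(0, \ldots, 0, \tfrac{1}{s_1 + 1}, \tfrac{2}{s_1 + 1}, \ldots, \tfrac{s_1}{s_1 + 1}, \tfrac{s_2}{s_2 + 1}, \ldots, \tfrac{1}{s_2 + 1}, 0, \ldots, 0\Bigr)\, x^{(i, j_1, \ldots, j_{m-1})},$$

where the nonzero entries in (A.8) appear in positions $j_r + 1, \ldots, j_{r+1} - 2$, so 
that the residual depends only on $x_{j_r + 1}, \ldots, x_{j_{r+1} - 1}$. Thus, (A.8) becomes 

(A.9)
$$x_i + \frac{1}{s_1 + 1} \sum_{j=1}^{s_1} j\, x_{j_r + j} + \frac{1}{s_2 + 1} \sum_{j=1}^{s_2} (s_2 - j + 1)\, x_{i + j}.$$

Since $x_j = \bar{Z}_j - \bar{Z}_{j+1}$, (A.9) can be written as 

(A.10)
$$\begin{aligned}
&\frac{1}{s_1 + 1} \sum_{j=1}^{s_1 + 1} \bar{Z}_{j_r + j} - \frac{1}{s_2 + 1} \sum_{j=1}^{s_2 + 1} \bar{Z}_{i + j} \\
&\quad = \frac{1}{i - j_r} \sum_{j = j_r + 1}^{i} \bar{Z}_j - \frac{1}{j_{r+1} - i} \sum_{j = i + 1}^{j_{r+1}} \bar{Z}_j \\
&\quad = \frac{j_{r+1} - j_r}{(j_{r+1} - i)(i - j_r)}
\left[\sum_{j = j_r + 1}^{i} \bar{Z}_j - \frac{i - j_r}{j_{r+1} - j_r} \sum_{j = j_r + 1}^{j_{r+1}} \bar{Z}_j\right].
\end{aligned}$$

A similar computation yields the denominator of $U_{mi}$, namely 
$\sigma_{(i \cdot j_1, \ldots, j_{m-1})}$, as 

(A.11)
$$\left[\frac{j_{r+1} - j_r}{(j_{r+1} - i)(i - j_r)}\right]^{1/2}.$$

Combine (A.10) and (A.11) and (6.3) is established. □ 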